I liked this blog post by Eevee.
He wrote about interesting things regarding random distributions, and linked to this page which describes a weird distribution implemented as Rnz in the NetHack game.
Note: I never heard of any of those before today.
I wanted to implement and experiment with the Rnz distribution myself.
Its code (see here) uses two other distributions, Rne and Rn2.
In [41]:
%load_ext watermark
%watermark -v -m -p numpy,matplotlib
In [39]:
import random
import numpy as np
import matplotlib.pyplot as plt
Rn2 distributionThe Rn2 distribution is simply an integer uniform distribution, between $0$ and $x-1$.
In [19]:
def rn2(x):
return random.randint(0, x-1)
In [20]:
np.asarray([rn2(10) for _ in range(100)])
Out[20]:
Testing for rn2(x) == 0 gives a $1/x$ probability :
In [32]:
from collections import Counter
In [35]:
Counter([rn2(10) == 0 for _ in range(100)])
Out[35]:
In [36]:
Counter([rn2(10) == 0 for _ in range(1000)])
Out[36]:
In [37]:
Counter([rn2(10) == 0 for _ in range(10000)])
Out[37]:
Rne distributionThe Rne distribution is a truncated geometric distribution.
In [88]:
def rne(x, truncation=5):
truncation = max(truncation, 1)
tmp = 1
while tmp < truncation and rn2(x) == 0:
tmp += 1
return tmp
In the NetHack game, the player's experience is used as default value of the
truncationparameter...
In [89]:
np.asarray([rne(3) for _ in range(50)])
Out[89]:
In [90]:
plt.hist(np.asarray([rne(3) for _ in range(10000)]), bins=5)
Out[90]:
In [91]:
np.asarray([rne(4, truncation=10) for _ in range(50)])
Out[91]:
In [92]:
plt.hist(np.asarray([rne(4, truncation=10) for _ in range(10000)]), bins=10)
Out[92]:
Let's check what this page says about rne(4):
The rne(4) call returns an integer from 1 to 5, with the following probabilities:
Number Probability 1 3/4 2 3/16 3 3/64 4 3/256 5 1/256
In [96]:
ref_table = {1: 3/4, 2: 3/16, 3: 3/64, 4: 3/256, 5: 1/256}
ref_table
Out[96]:
In [99]:
N = 100000
table = Counter([rne(4, truncation=5) for _ in range(N)])
for k in table:
table[k] /= N
table = dict(table)
table
Out[99]:
In [111]:
rel_diff = lambda x, y: abs(x - y) / x
for k in ref_table:
x, y = ref_table[k], table[k]
r = rel_diff(x, y)
print(f"For k={k}: relative difference is {r:.3g} between {x:.3g} (expectation) and {y:.3g} (with N={N} samples).")
Seems true !
In [112]:
def rnz(i, truncation=10):
x = i
tmp = 1000
tmp += rn2(1000)
tmp *= rne(4, truncation=truncation)
flip = rn2(2)
if flip:
x *= tmp
x /= 1000
else:
x *= 1000
x /= tmp
return int(x)
In [113]:
np.asarray([rnz(3) for _ in range(100)])
Out[113]:
In [114]:
np.asarray([rnz(3, truncation=10) for _ in range(100)])
Out[114]:
In [115]:
np.asarray([rnz(350) for _ in range(100)])
Out[115]:
In [122]:
_ = plt.hist(np.asarray([rnz(350) for _ in range(100000)]), bins=200)
In [78]:
np.asarray([rnz(350, truncation=10) for _ in range(100)])
Out[78]:
In [120]:
_ = plt.hist(np.asarray([rnz(350, truncation=10) for _ in range(10000)]), bins=200)